A hierarchical approach to improve job scheduling and data replication in data grid
Abstract
In the dynamic environment of a data grid, effective job scheduling methods consider the location of required data when dispatching jobs to resources. Job scheduling is also combined with data replication mechanisms to reduce remote data access and save network bandwidth. In this paper, we combine a job scheduling method with dynamic data replication to reduce data access delay and job execution time. We extend this work by applying a Bloom filter to the job scheduling decision. A proper job scheduling method in a data grid requires appropriate mechanisms for recording, deleting, and querying information about data files. We therefore apply a counting Bloom filter to record, delete, and query information about data files in the Replica Catalogue (RC). Simulation results indicate that the proposed job scheduling and data replication methods reduce job execution time, and that using the Bloom filter saves network bandwidth and reduces the time needed to gather information for selecting appropriate resources during job scheduling.
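The counting Bloom filter mentioned in the abstract can be illustrated with a minimal sketch. This is a generic textbook version, not the authors' implementation: the class name `CountingBloomFilter`, the parameters `num_counters` and `num_hashes`, the SHA-256 double-hashing scheme, and the file names are all assumptions made for illustration. The abstract does not say how the Replica Catalogue stores or exchanges its filters.

```python
import hashlib


class CountingBloomFilter:
    """Generic counting Bloom filter (illustrative, not the paper's code).

    Each slot holds a counter instead of a single bit, so entries can be
    deleted as well as recorded and queried.
    """

    def __init__(self, num_counters=1024, num_hashes=4):
        self.num_counters = num_counters
        self.num_hashes = num_hashes
        self.counters = [0] * num_counters

    def _positions(self, item):
        # Double hashing: derive all positions from two parts of one digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force a non-zero step
        return [(h1 + i * h2) % self.num_counters
                for i in range(self.num_hashes)]

    def add(self, item):
        """Record an item, e.g. a file name held at a site."""
        for pos in self._positions(item):
            self.counters[pos] += 1

    def remove(self, item):
        """Delete an item; refuses if the item does not appear to be present."""
        if item not in self:
            return False
        for pos in self._positions(item):
            self.counters[pos] -= 1
        return True

    def __contains__(self, item):
        """Query: False is definite, True may be a false positive."""
        return all(self.counters[pos] > 0 for pos in self._positions(item))


# Hypothetical usage: one filter summarising the files replicated at a site.
site_files = CountingBloomFilter()
site_files.add("file_A")
site_files.add("file_B")
has_a = "file_A" in site_files   # True: added items are always found
site_files.remove("file_A")      # the replica was deleted from the site
```

A scheduler could consult a compact summary of this kind instead of a full file listing, which is one plausible reading of the bandwidth saving the abstract reports. Queries can return false positives, so a positive answer would still need confirmation before a job is dispatched.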
Similar resources
An Efficient Data Replication Strategy in Large-Scale Data Grid Environments Based on Availability and Popularity
Data grid technology, which uses the scale of the Internet to overcome storage limits for huge amounts of data, has become a hot research topic. Recently, data replication strategies have been widely employed in distributed environments to copy frequently accessed data to suitable sites. The primary purposes are shortening the distance of file transmission and achieving files from ...
Improving Data Grids Performance by Using Modified Dynamic Hierarchical Replication Strategy
Abstract: A Data Grid connects a collection of geographically distributed computational and storage resources, enabling users to share data and other resources. Data replication, a technique much discussed by Data Grid researchers in recent years, creates multiple copies of a file and places them in various locations to shorten file access times. In this paper, a dynamic data replication strate...
Data Replication-Based Scheduling in Cloud Computing Environment
Abstract— High-performance computing and vast storage are two key factors required for executing data-intensive applications. In comparison with traditional distributed systems like data grid, cloud computing provides these factors in a more affordable, scalable and elastic platform. Furthermore, accessing data files is critical for performing such applications. Sometimes accessing data becomes...
Job Scheduling and Data Replication in Hierarchical Data Grid
A Data Grid is a geographically distributed environment that deals with data-intensive applications in scientific and enterprise computing. In data-intensive applications, data transfer is a primary cause of job execution delay. Data access time depends on bandwidth, especially when the network has a hierarchy of bandwidths. Effective job scheduling can reduce data transfer time by considering hiera...
Improving Mobile Grid Performance Using Fuzzy Job Replica Count Determiner
Grid computing is a term referring to the combination of computer resources from multiple administrative domains to reach a common computational platform. Mobile computing is a generic term for the use of movable, handheld devices with wireless communication to process data. Mobile computing focuses on providing access to data, information, services and communications anywhere an...
Journal title:
- Int. Arab J. Inf. Technol.
Volume: 12, Issue:
Pages: -
Publication date: 2015